30 research outputs found

    On the design of architecture-aware algorithms for emerging applications

    This dissertation maps various kernels and applications to a spectrum of programming models and architectures and also presents architecture-aware algorithms for different systems. The kernels and applications discussed in this dissertation have widely varying computational characteristics. For example, we consider both dense numerical computations and sparse graph algorithms. This dissertation also covers emerging applications from image processing, complex network analysis, and computational biology. We map these problems to diverse multicore processors and manycore accelerators. We also use new programming models (such as Transactional Memory, MapReduce, and Intel TBB) to address the performance and productivity challenges in these problems. Our experiences highlight the importance of mapping applications to appropriate programming models and architectures. We also identify several limitations of current system software and architectures, along with directions for improving them. The discussion focuses on system software and architectural support for nested irregular parallelism, Transactional Memory, and hybrid data transfer mechanisms. We believe that the complexity of parallel programming can be significantly reduced through collaborative efforts among researchers and practitioners from different domains. This dissertation contributes to these efforts by providing benchmarks and suggestions for improving system software and architectures.
    Ph.D. dissertation. Committee Chair: Bader, David; Committee Members: Hong, Bo; Riley, George; Vuduc, Richard; Wills, Scot
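
    As a rough illustration of one of the programming models named above (and not code from the dissertation), the following sketch uses Intel TBB's parallel_for to express a simple data-parallel kernel; the vector-scaling kernel itself is a placeholder chosen only for brevity.

    // A minimal sketch, assuming only Intel TBB's public parallel_for API;
    // the kernel (vector scaling) is a stand-in, not a kernel from the thesis.
    #include <tbb/blocked_range.h>
    #include <tbb/parallel_for.h>

    #include <cstddef>
    #include <vector>

    // Scale a dense vector in parallel: TBB splits the index range into chunks
    // and schedules the chunks across worker threads.
    void scale(std::vector<double>& v, double alpha) {
        tbb::parallel_for(tbb::blocked_range<std::size_t>(0, v.size()),
            [&](const tbb::blocked_range<std::size_t>& r) {
                for (std::size_t i = r.begin(); i != r.end(); ++i)
                    v[i] *= alpha;
            });
    }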

    Rec-DCM-Eigen: Reconstructing a Less Parsimonious but More Accurate Tree in Shorter Time

    Maximum parsimony (MP) methods aim to reconstruct the phylogeny of extant species by finding the most parsimonious evolutionary scenario using the species' genome data. MP methods are considered accurate, but they are also computationally expensive, especially for large numbers of species. Several disk-covering methods (DCMs), which decompose the input species into multiple overlapping subgroups (or disks), have been proposed to solve the problem in a divide-and-conquer way.
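
    To make the divide-and-conquer structure concrete, here is a schematic C++ sketch of the generic disk-covering flow described above. The Disk and Tree types and the decompose, solve_mp, and merge stand-ins are hypothetical placeholders, not the Rec-DCM-Eigen method itself.

    // A schematic sketch of the generic DCM divide-and-conquer flow; every
    // component below is a toy placeholder for the real decomposition,
    // MP heuristic, and supertree merge.
    #include <algorithm>
    #include <cstddef>
    #include <string>
    #include <vector>

    using Taxon = std::string;
    using Disk  = std::vector<Taxon>;            // one overlapping subgroup of taxa
    struct Tree { std::vector<Taxon> leaves; };  // placeholder phylogeny type

    // Hypothetical decomposition: fixed-size disks that overlap by one taxon.
    std::vector<Disk> decompose(const std::vector<Taxon>& taxa, std::size_t size = 4) {
        std::vector<Disk> disks;
        for (std::size_t i = 0; i + 1 < taxa.size(); i += size - 1)
            disks.emplace_back(taxa.begin() + i,
                               taxa.begin() + std::min(i + size, taxa.size()));
        return disks;
    }

    // Stand-in for the expensive MP heuristic, now applied only to small disks.
    Tree solve_mp(const Disk& disk) { return Tree{disk}; }

    // Stand-in for the supertree merge over the shared (overlapping) taxa.
    Tree merge(const std::vector<Tree>& subtrees) {
        Tree t;
        for (const Tree& s : subtrees)
            for (const Taxon& x : s.leaves)
                if (std::find(t.leaves.begin(), t.leaves.end(), x) == t.leaves.end())
                    t.leaves.push_back(x);
        return t;
    }

    Tree dcm_reconstruct(const std::vector<Taxon>& taxa) {
        std::vector<Tree> subtrees;
        for (const Disk& d : decompose(taxa))    // 1. divide into overlapping disks
            subtrees.push_back(solve_mp(d));     // 2. conquer each disk independently
        return merge(subtrees);                  // 3. combine the subtrees
    }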

    Optimizing JPEG2000 Still Image Encoding on the Cell Broadband Engine

    JPEG2000 is the latest still image coding standard from the JPEG committee, which adopts new algorithms such as Embedded Block Coding with Optimized Truncation (EBCOT) and the Discrete Wavelet Transform (DWT). These algorithms enable superior coding performance over JPEG and support various new features at the cost of increased computational complexity. The Sony-Toshiba-IBM Cell Broadband Engine (Cell/B.E.) is a heterogeneous multicore architecture with SIMD accelerators. In this work, we optimize the computationally intensive algorithmic kernels of JPEG2000 for the Cell/B.E. and also introduce a novel data decomposition scheme to achieve high performance with low programming complexity. We compare the Cell/B.E.'s performance with that of a 3.2 GHz Intel Pentium IV processor. The Cell/B.E. demonstrates 3.2 times higher performance for lossless encoding and 2.7 times higher performance for lossy encoding. For the DWT, the Cell/B.E. outperforms the Pentium IV processor by 9.1 times for the lossless case and 15 times for the lossy case. We also provide experimental results on one IBM QS20 blade with two Cell/B.E. chips and a performance comparison with an existing JPEG2000 encoder for the Cell/B.E.
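
    For reference, the following plain scalar sketch computes one 1-D level of the reversible Le Gall 5/3 lifting DWT used on the JPEG2000 lossless path (a predict step followed by an update step, with symmetric boundary handling). It is only a reference formulation, not the SIMD-optimized Cell/B.E. kernel described in the paper.

    // A scalar reference sketch of one 1-D level of the reversible 5/3 lifting
    // DWT; no SIMD and no Cell/B.E.-specific optimization.
    #include <cstddef>
    #include <vector>

    // Whole-sample symmetric extension of the input signal at its boundaries.
    static int xe(const std::vector<int>& x, long i) {
        const long n = static_cast<long>(x.size());
        if (i < 0)  i = -i;
        if (i >= n) i = 2 * n - 2 - i;
        return x[static_cast<std::size_t>(i)];
    }

    // Forward transform: even samples become the low-pass band s, odd samples
    // the high-pass band d, via the two integer lifting steps of the 5/3 filter.
    // The arithmetic right shifts act as floor division on the signed sums.
    void dwt53_forward(const std::vector<int>& x,
                       std::vector<int>& s, std::vector<int>& d) {
        const std::size_t n = x.size();
        d.assign(n / 2, 0);
        s.assign(n - n / 2, 0);

        // Predict step: d[i] = x[2i+1] - floor((x[2i] + x[2i+2]) / 2)
        for (std::size_t i = 0; i < d.size(); ++i) {
            const long j = static_cast<long>(i);
            d[i] = x[2 * i + 1] - ((xe(x, 2 * j) + xe(x, 2 * j + 2)) >> 1);
        }

        // Clamping d at its ends matches symmetric extension of the input.
        auto de = [&](long i) -> int {
            if (d.empty()) return 0;
            if (i < 0) i = 0;
            if (i >= static_cast<long>(d.size())) i = static_cast<long>(d.size()) - 1;
            return d[static_cast<std::size_t>(i)];
        };

        // Update step: s[i] = x[2i] + floor((d[i-1] + d[i] + 2) / 4)
        for (std::size_t i = 0; i < s.size(); ++i) {
            const long j = static_cast<long>(i);
            s[i] = x[2 * i] + ((de(j - 1) + de(j) + 2) >> 2);
        }
    }

    Because both lifting steps are integer-to-integer and exactly invertible (the inverse simply undoes the update and then the predict step), this filter supports the lossless coding path mentioned above.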

    Engineering and Analyzing Multicellular Systems: Methods and Protocols

    Mathematical modeling and computer simulation are important tools for understanding complex interactions between cells and their biotic and abiotic environment: similarities and differences between modeled and observed behavior provide the basis for hypothesis formation. Momeni et al. (Elife 2:e00230, 2013) investigated pattern formation in communities of yeast strains engaging in different types of ecological interactions, comparing the predictions of mathematical modeling and simulation to the actual patterns observed in wet-lab experiments. However, simulations of millions of cells in a three-dimensional community are extremely time consuming. One simulation run in MATLAB may take a week or longer, inhibiting exploration of the vast space of parameter combinations and assumptions. Improving the speed, scale, and accuracy of such simulations facilitates hypothesis formation and expedites discovery. Biocellion is a high-performance software framework for accelerating discrete agent-based simulation of biological systems with millions to trillions of cells. Simulations of scale and accuracy comparable to those taking a week of computer time in MATLAB require just hours using Biocellion on a multicore workstation. Biocellion further accelerates large-scale, high-resolution simulations on cluster computers by partitioning the work across multiple compute nodes. Biocellion targets computational biologists who have mathematical modeling backgrounds and basic C++ programming skills. This chapter describes the necessary steps to adapt the original model of Momeni et al. to the Biocellion framework as a case study.
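
    As a toy illustration of the discrete agent-based simulation style described above, the C++ sketch below time-steps a population of lattice-bound cells that consume local nutrient and divide stochastically. It uses no Biocellion API and is not code from the chapter; every name in it is a made-up placeholder, and it assumes the cells start inside a non-empty nutrient grid.

    // A framework-free toy of discrete agent-based cell simulation: each agent
    // consumes nutrient at its lattice site and may divide into a neighbor site.
    #include <algorithm>
    #include <random>
    #include <vector>

    struct Cell { int x, y, strain; };  // one agent on a 2-D lattice

    // One time step over all agents; newly born cells are appended at the end.
    void step(std::vector<Cell>& cells,
              std::vector<std::vector<double>>& nutrient,
              std::mt19937& rng) {
        std::uniform_real_distribution<double> u(0.0, 1.0);
        std::uniform_int_distribution<int> dir(0, 3);
        const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};
        const int nx = static_cast<int>(nutrient.size());
        const int ny = static_cast<int>(nutrient[0].size());

        std::vector<Cell> born;
        for (Cell& c : cells) {
            double& n = nutrient[c.x][c.y];
            const double uptake = std::min(n, 0.1);  // consume up to 0.1 units/step
            n -= uptake;
            if (u(rng) < uptake) {                   // divide with probability ~ uptake
                const int d = dir(rng);
                const int px = c.x + dx[d], py = c.y + dy[d];
                if (px >= 0 && px < nx && py >= 0 && py < ny)
                    born.push_back({px, py, c.strain});
            }
        }
        cells.insert(cells.end(), born.begin(), born.end());
    }

    Scaling this style of loop to millions or trillions of agents is precisely where a framework such as Biocellion, with its domain partitioning across compute nodes, comes in.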